Bspline #39
Conversation
Several comments, largely on inheritance structure and documentation. Some bigger points:
- in general, as mentioned in Docstring/signature and inheritance #30, having `evaluate` come from `Basis` and the subclasses implement `_evaluate` gets confusing, because the docstring for the function we want users to call is generic. might be worth revisiting whether the super class has `evaluate` or `_evaluate` (which the subclasses call). The other way to do this would be to remove the `eval_basis = self._evaluate(*xi)` from `Basis.evaluate()` and have the subclasses call `super().evaluate()` to get the checks, but then I don't know whether we can then require users to implement `evaluate` (maybe we can, just by marking it as `@abstract`, but I'm not sure). Out of scope for this PR, but it's making me even more inclined to think we should change this. If you'd like, I can try it out while you're working on the GLM class (it shouldn't change the interface at all).
- For all spline bases, does `knot_locs` need to be an attribute (or at least, a public one)? seems like `_generate_knots` is always called when it's needed, and we don't expect the user to use it, right? so it probably shouldn't be an attribute.
- I don't love the tests (e.g., `test_minimum_number_of_basis_required_is_matched` for either BSpline) that have a long boolean expression as `raise_exception` and then a bunch of ORs in the `match` for `pytest.raises` -- it feels like we're losing the specificity of the test (and the readability), since I don't know which error is expected in which condition. the alternative would be to split them up into separate tests or to have an `if` block defining the match string. both are long, but I think the second is worth doing, since it seems like a decent tradeoff between readability and compactness. Thoughts?
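To make the suggestion concrete, here is a minimal sketch of the "`if` block defining the match string" pattern. The constructor, conditions, and error messages below are illustrative stand-ins, not the library's actual ones:

```python
import re

import pytest


# Hypothetical stand-in for the basis constructor under test; the
# conditions and messages are made up for illustration only.
def instantiate_basis(n_basis_funcs, order):
    if n_basis_funcs < 1:
        raise ValueError("Object class requires >= 1 basis elements.")
    if order > n_basis_funcs:
        raise ValueError("order must not exceed the number of basis elements.")
    return (n_basis_funcs, order)


@pytest.mark.parametrize("n_basis_funcs", [0, 1, 3])
@pytest.mark.parametrize("order", [1, 2])
def test_minimum_number_of_basis_required_is_matched(n_basis_funcs, order):
    # An if-block picks the single expected message for each condition,
    # instead of OR-ing every pattern into one `match` regex.
    if n_basis_funcs < 1:
        match = "requires >= 1 basis elements"
    elif order > n_basis_funcs:
        match = "order must not exceed"
    else:
        match = None

    if match is not None:
        with pytest.raises(ValueError, match=re.escape(match)):
            instantiate_basis(n_basis_funcs, order)
    else:
        # Valid parameters: construction should succeed without raising.
        instantiate_basis(n_basis_funcs, order)
```

The test reads top to bottom as a specification: for each parameter combination, exactly one expected message (or none) is named before the assertion runs.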
src/neurostatslib/basis.py
Outdated
def _check_samples_non_empty(sample_pts):
    if sample_pts.shape[0] == 0:
        raise ValueError(
            "Empty sample array provided. At least one sample is required for evaluation!"
why is this here and only for BSpline?
from the tests, looks like you're expecting it when `evaluate_on_grid(x)` for `x<=0`, but is that really an issue? for the negative case, we'll get an error in `np.linspace`, and for `x=0`, you'd get an empty array otherwise, which seems reasonable. Is there a specific reason to check for this?
and in general, don't like that we'll have different behavior for BSpline than for the rest, when really, it's the same concerns, right?
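For reference, this is what NumPy itself does with the two cases mentioned above (zero samples yields an empty array; a negative count raises inside `np.linspace`):

```python
import numpy as np

# Zero requested samples: np.linspace returns an empty array, no error.
empty = np.linspace(0, 1, 0)
print(empty.shape)  # (0,)

# Negative sample count: np.linspace raises ValueError on its own.
try:
    np.linspace(0, 1, -1)
except ValueError as e:
    print("np.linspace raised:", e)
```

So without an explicit check, the negative case is already caught upstream, and only the empty case propagates further down.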
If you don't check, with x=0 you get an "invalid input data" error message from the scipy `splev` call.
I am fine with that but:
- it is a bit nested and less clear to a user what may have caused it.
- MSpline, RaisedCosineLog, and RaisedCosineLinear return an empty array, not an error.
- OrthExp raises another class-specific error, namely that you need at least as many samples as basis functions to define the basis (otherwise the orthonormalization would not be defined).

Uniformity in behavior is hard to get given that different basis constructions have different requirements.
My rationale here was to try to surface the error and give a basis-type specific explanation of the error which is understandable.
If we try to get something that looks more uniform, there are a few ways:
- return an empty array for B-spline too, instead of calling `splev` in case of empty input
- raise the error instead of returning an empty array for all of the basis functions we have

Which one do you prefer?
Gotcha. I'm inclined to work towards uniformity in behavior, and I don't have a strong preference here, but I think raising the error for all of them is the right way? because if someone calls `evaluate_on_grid(0)`, they've really misunderstood what it does. right?
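A minimal sketch of the "raise for all of them" option: the check from this PR's diff lives once in the shared base class, so every subclass fails the same way on empty input. Class names and the `evaluate`/`_evaluate` split follow the discussion above, but the details here are assumptions, not the actual implementation:

```python
import numpy as np


class Basis:
    @staticmethod
    def _check_samples_non_empty(sample_pts):
        # Shared check: identical error for every basis subclass.
        if sample_pts.shape[0] == 0:
            raise ValueError(
                "Empty sample array provided. "
                "At least one sample is required for evaluation!"
            )

    def evaluate(self, sample_pts):
        sample_pts = np.asarray(sample_pts)
        self._check_samples_non_empty(sample_pts)
        return self._evaluate(sample_pts)


class MSplineBasis(Basis):
    def _evaluate(self, sample_pts):
        # Placeholder: a real subclass would compute its basis here.
        return np.ones((sample_pts.shape[0], 1))
```

With this layout, `MSplineBasis().evaluate([])` raises the shared `ValueError` before any subclass-specific code (and any nested `splev` error) can run.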
Co-authored-by: William F. Broderick <[email protected]>
…near-models into bspline
@@ -94,6 +94,11 @@ def evaluate(self, *xi: NDArray) -> NDArray:
    -------
    :
        The generated basis functions.
    Raises
I feel like there's supposed to be an empty line here
Looks good to me, just remove the one commented-out line, and let's discuss the uniformity of behavior for `evaluate_on_grid(0)`.
Like Guillaume mentioned, I think these tests are really hard to understand, but that seems out of scope here.
src/neurostatslib/basis.py
Outdated
    The evaluation is performed by looping over each element and using `splev`
    from SciPy to compute the basis values.
    """
    super()._check_samples_non_empty(sample_pts)
    # super()._check_samples_non_empty(sample_pts)
if not using this, should remove this line
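For context on the docstring above, here is a sketch of how B-spline basis elements can be evaluated by "looping over each element and using `splev`". The knot layout and parameter choices are assumptions for illustration, not the PR's actual code:

```python
import numpy as np
from scipy.interpolate import splev

order = 3          # spline order (degree = order - 1)
n_basis_funcs = 5  # number of basis elements

# Clamped knot vector on [0, 1]: the end knots are repeated so that
# len(knots) == n_basis_funcs + order (required by splev for this degree).
knots = np.concatenate((
    np.zeros(order - 1),
    np.linspace(0, 1, n_basis_funcs - order + 2),
    np.ones(order - 1),
))

sample_pts = np.linspace(0, 1, 100)
basis = np.zeros((sample_pts.size, n_basis_funcs))
for i in range(n_basis_funcs):
    # Evaluate the i-th basis element by setting only its coefficient to 1.
    coeffs = np.zeros(n_basis_funcs)
    coeffs[i] = 1.0
    basis[:, i] = splev(sample_pts, (knots, coeffs, order - 1))

print(basis.shape)  # (100, 5)
```

Each column is one basis element; on the interior of [0, 1] the columns sum to 1 (partition of unity), which is a quick sanity check for the knot construction.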
Added BSplineBasis and CyclicBSplineBasis to the basis module